Search for: All records

Creators/Authors contains: "Huang, Chao"


  1. ABSTRACT

    The aim of this paper is to systematically investigate merging and ensembling methods for spatially varying coefficient mixed effects models (SVCMEM) in order to carry out integrative learning of neuroimaging data obtained from multiple biomedical studies. The "merged" approach involves training a single learning model using a comprehensive dataset that encompasses information from all the studies. Conversely, the "ensemble" approach involves creating a weighted average of distinct learning models, each developed from an individual study. We systematically investigate the prediction accuracy of the merged and ensemble learners in the presence of different degrees of interstudy heterogeneity. Additionally, we establish asymptotic guidelines for making strategic decisions about when to employ either of these models in different scenarios, along with deriving optimal weights for the ensemble learner. To validate our theoretical results, we perform extensive simulation studies. The proposed methodology is also applied to three large-scale neuroimaging studies.
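
    The distinction between the merged and ensemble learners can be illustrated with a minimal sketch. It substitutes ordinary least squares for the SVCMEM, uses equal ensemble weights rather than the optimal weights derived in the paper, and simulates the studies; every function name and parameter here (`fit_ols`, `merged_predict`, `ensemble_predict`, `simulate_studies`, `heterogeneity`) is hypothetical and not taken from the paper.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (no intercept)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def merged_predict(studies, X_new):
    """Merged learner: fit a single model on the pooled data from all studies."""
    X_all = np.vstack([X for X, _ in studies])
    y_all = np.concatenate([y for _, y in studies])
    return X_new @ fit_ols(X_all, y_all)

def ensemble_predict(studies, X_new, weights=None):
    """Ensemble learner: fit one model per study, then average the predictions."""
    k = len(studies)
    if weights is None:
        # Assumption: equal weights. The paper derives optimal weights instead.
        weights = np.full(k, 1.0 / k)
    weights = np.asarray(weights, dtype=float)
    preds = np.stack([X_new @ fit_ols(X, y) for X, y in studies])  # shape (k, n_new)
    return weights @ preds

def simulate_studies(n_studies, n_per_study, beta, heterogeneity, rng):
    """Toy data: each study has its own coefficients, perturbed around beta."""
    studies = []
    for _ in range(n_studies):
        beta_k = beta + heterogeneity * rng.standard_normal(beta.shape)
        X = rng.standard_normal((n_per_study, beta.size))
        y = X @ beta_k + rng.standard_normal(n_per_study)
        studies.append((X, y))
    return studies

rng = np.random.default_rng(0)
beta = np.array([1.0, -2.0, 0.5])
studies = simulate_studies(4, 50, beta, heterogeneity=0.5, rng=rng)
X_new = rng.standard_normal((10, 3))

merged = merged_predict(studies, X_new)
ensemble = ensemble_predict(studies, X_new)
```

    Varying `heterogeneity` in this toy setup is one way to see how the two learners diverge as the studies become less alike; it says nothing about the asymptotic guidelines established in the paper.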

     
  2. Free, publicly-accessible full text available January 1, 2025
  3. Large-scale imaging studies often face challenges stemming from heterogeneity arising from differences in geographic location, instrumental setups, image acquisition protocols, study design, and latent variables that remain undisclosed. While numerous regression models have been developed to elucidate the interplay between imaging responses and relevant covariates, limited attention has been devoted to cases where the imaging responses pertain to the domain of shape. This adds complexity to the problem of imaging heterogeneity, primarily due to the unique properties inherent to shape representations, including nonlinearity, high-dimensionality, and the intricacies of quotient space geometry. To tackle this intricate issue, we propose a novel approach: a shape-on-scalar regression model that incorporates confounder adjustment. In particular, we leverage the square root velocity function to extract elastic shape representations which are embedded within the linear Hilbert space of square integrable functions. Subsequently, we introduce a shape regression model aimed at characterizing the intricate relationship between elastic shapes and covariates of interest, all while effectively managing the challenges posed by imaging heterogeneity. We develop comprehensive procedures for estimating and making inferences about the unknown model parameters. Through real-data analysis, our method demonstrates its superiority in terms of estimation accuracy when compared to existing approaches.
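
    As a rough illustration of the square root velocity function mentioned above, the sketch below computes q(t) = f'(t) / sqrt(|f'(t)|) for a discretized curve using finite differences. It covers only the transform itself, not the removal of rotation and reparameterization or the regression model; the helper names `srvf` and `l2_norm_sq` and the half-circle example are assumptions made for illustration.

```python
import numpy as np

def srvf(curve, t):
    """Square root velocity function of a sampled curve.

    curve: array of shape (n_points, dim); t: array of shape (n_points,).
    Returns q with the same shape, q = f' / sqrt(|f'|).
    """
    velocity = np.gradient(curve, t, axis=0)
    speed = np.linalg.norm(velocity, axis=1)
    # Where the speed is (numerically) zero the velocity is zero too, so q = 0 there.
    safe_speed = np.where(speed > 1e-12, speed, 1.0)
    return velocity / np.sqrt(safe_speed)[:, None]

def l2_norm_sq(q, t):
    """Squared L2 norm of q over t, using the trapezoidal rule."""
    integrand = np.sum(q ** 2, axis=1)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)))

# Hypothetical example: upper half of the unit circle, parameterized on [0, 1].
t = np.linspace(0.0, 1.0, 2001)
curve = np.column_stack([np.cos(np.pi * t), np.sin(np.pi * t)])
q = srvf(curve, t)
length = l2_norm_sq(q, t)  # squared L2 norm of q equals the curve's arc length
```

    Because the squared L2 norm of q is the arc length of the curve, `length` here should be close to pi, which gives a simple sanity check on the discretization.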

     
    Free, publicly-accessible full text available December 1, 2024
  4. Free, publicly-accessible full text available August 24, 2024
  5. Abstract

    Functional data analysis (FDA) is a fast-growing area of research and development in statistics. While most FDA literature imposes the classical $\mathbb{L}^2$ Hilbert structure on function spaces, there is an emergent need for a different, shape-based approach for analyzing functional data. This paper reviews and develops fundamental geometrical concepts that help connect traditionally diverse fields of shape and functional analyses. It showcases that focusing on shapes is often more appropriate when structural features (number of peaks and valleys and their heights) carry salient information in data. It recaps recent mathematical representations and associated procedures for comparing, summarizing, and testing the shapes of functions. Specifically, it discusses three tasks: shape fitting, shape fPCA, and shape regression models. The latter refers to the models that separate the shapes of functions from their phases and use them individually in regression analysis. The ensuing results provide better interpretations and tend to preserve geometric structures. The paper also discusses an extension where the functions are not real-valued but manifold-valued. The article presents several examples of this shape-centric functional data analysis using simulated and real data.

     
  6. Abstract

    The pork industry is an essential part of the global food system, providing a significant source of protein for people around the world. A major factor restraining productivity and compromising animal wellbeing in the pork industry is disease outbreaks in pigs throughout the production process: widespread outbreaks can lead to losses as high as 10% of the U.S. pig population in extreme years. In this study, we present a machine learning model to predict the emergence of infection in swine production systems throughout the production process on a daily basis, a potential precursor to outbreaks whose detection is vital for disease prevention and mitigation. We determine features that provide the most value in predicting infection, which include nearby farm density, historical test rates, piglet inventory, feed consumption during the gestation period, and wind speed and direction. We use these features to build a generalizable machine learning model and evaluate its ability to predict outbreaks both seven and 30 days in advance, allowing for early warning of disease infection. We evaluate the model on two swine production systems with different volumes of data and analyze the effects of data availability and data granularity. Our results demonstrate good ability to predict infection in both systems, with a balanced accuracy of 85.3% on any disease in the first system and balanced accuracies (average prediction accuracy on positive and negative samples) of 58.5%, 58.7%, 72.8% and 74.8% on porcine reproductive and respiratory syndrome, porcine epidemic diarrhea virus, influenza A virus, and Mycoplasma hyopneumoniae in the second system, respectively, using the six most important predictors in all cases. These models provide daily infection probabilities that can be used by veterinarians and other stakeholders as a benchmark to more timely support preventive and control strategies on farms.
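
    The balanced accuracy defined above (the average of the accuracy on positive samples and the accuracy on negative samples) can be computed as in the following sketch. The function name `balanced_accuracy` and the toy labels are assumptions for illustration and are not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of the accuracy on positive (1) and negative (0) samples."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    acc_pos = sum(p == 1 for p in pos) / len(pos)  # true positive rate
    acc_neg = sum(p == 0 for p in neg) / len(neg)  # true negative rate
    return (acc_pos + acc_neg) / 2

# Toy, made-up labels: infections are rare, and the predictor always says "no infection".
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
plain_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)  # 0.8
bal_acc = balanced_accuracy(y_true, y_pred)                            # 0.5
```

    The toy case shows why the metric suits rare events such as infections: a predictor that never flags an infection scores 0.8 on plain accuracy but only 0.5 on balanced accuracy.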

     
  7. Recent advances in structural DNA nanotechnology have been facilitated by design tools that continue to push the limits of structural complexity while simplifying an often-tedious design process. We recently introduced the software MagicDNA, which enables design of complex 3D DNA assemblies with many components; however, the design of structures with free-form features like vertices or curvature still required iterative design guided by simulation feedback and user intuition. Here, we present an updated design tool, MagicDNA 2.0, that automates the design of free-form 3D geometries, leveraging design models informed by coarse-grained molecular dynamics simulations. Our GUI-based, stepwise design approach integrates a high level of automation with versatile control over assembly and subcomponent design parameters. We experimentally validated this approach by fabricating a range of DNA origami assemblies with complex free-form geometries, including a 3D Nozzle, G-clef, and Hilbert and Trifolium curves, confirming excellent agreement between design input, simulation, and structure formation.

     
    Free, publicly-accessible full text available July 28, 2024
  8. Free, publicly-accessible full text available July 29, 2024
  9. Free, publicly-accessible full text available July 23, 2024